Provenance and Probabilities in Relational Databases: From Theory to Practice
Abstract
We review the basics of data provenance in relational databases. We describe different provenance formalisms, from Boolean provenance to provenance semirings and beyond, that can be used for a wide variety of purposes, to obtain additional information on the output of a query. We discuss representation systems for data provenance, circuits in particular, with a focus on practical implementation. Finally, we explain how provenance is practically used for probabilistic query evaluation in proba-
Similar resources
Provenance Traces
Provenance is information about the origin, derivation, ownership, or history of an object. It has recently been studied extensively in scientific databases and other settings due to its importance in helping scientists judge data validity, quality and integrity. However, most models of provenance have been stated as ad hoc definitions motivated by informal concepts such as “comes from”, “influ...
Improv: Flexible Data Provenance for Relational Databases
Curated databases, which consist of data extracted from original sources, printed articles, and other databases, are a valuable source of data for scientists. However, as curated databases aggregate information from multiple sources, the origin of the data elements can be lost. Because of this, curated databases often provide support for data annotations, which are pieces of extra information a...
Why and Where: A Characterization of Data Provenance
With the proliferation of database views and curated databases, the issue of data provenance (where a piece of data came from and the process by which it arrived in the database) is becoming increasingly important, especially in scientific databases where understanding provenance is crucial to the accuracy and currency of data. In this paper we describe an approach to computing provenance when...
A Graph Model of Data and Workflow Provenance
Provenance has been studied extensively in both database and workflow management systems, so far with little convergence of definitions or models. Provenance in databases has generally been defined for relational or complex object data, by propagating fine-grained annotations or algebraic expressions from the input to the output. This kind of provenance has been found useful in other areas of c...
Symmetry in Probabilistic Databases
Researchers in databases, AI, and machine learning have all proposed representations of probability distributions over relational databases (possible worlds). In a tuple-independent probabilistic database, the possible worlds all have distinct probabilities, because the tuple probabilities are distinct. In AI and machine learning, however, one typically learns highly symmetric distributions, w...
Publication date: 2017